An even faster algorithm for ridge regression of reduced rank data
Author
Abstract
Hawkins and Yin (Comput. Statist. Data Anal. 40 (2002) 253) describe an algorithm for ridge regression of reduced rank data, i.e. data where p, the number of variables, is larger than n, the number of observations. Whereas a direct implementation of ridge regression in this setting requires calculations of order O(np² + p³), their algorithm uses only calculations of order O(np²). In this paper, we describe an alternative algorithm based on a factorization of the (transposed) design matrix. This approach is numerically more stable, further reduces the amount of computation and needs less memory. In particular, we show that the factorization can be calculated in O(n²p) operations. Once the factorization is obtained, the ridge regression estimator can be calculated in O(np) operations for any value of the ridge parameter, and the generalized cross-validation score in O(n) operations. © 2004 Elsevier B.V. All rights reserved.
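The complexity claims can be illustrated with a thin singular value decomposition of X, one factorization that attains the stated bounds; the paper's actual factorization of the transposed design matrix may differ. Below is a minimal NumPy sketch, assuming a centred n × p design matrix X with p > n and response vector y; all function names are illustrative, not the author's.

import numpy as np

def ridge_factorize(X, y):
    # One-off cost: thin SVD of the n x p matrix X, O(n^2 p) when n < p.
    # For n < p this yields U (n x n), d (n,), Vt (n x p) with X = U @ diag(d) @ Vt.
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    return d, Vt, U.T @ y  # precompute U^T y once, O(n^2)

def ridge_coef(d, Vt, Uty, lam):
    # Ridge estimator beta = V diag(d / (d^2 + lam)) U^T y:
    # an O(n) rescaling plus one O(np) matrix-vector product per lambda.
    return Vt.T @ (d * Uty / (d ** 2 + lam))

def gcv(d, Uty, lam, n):
    # GCV(lam) = n * RSS / (n - df)^2, computed in O(n) per lambda.
    shrink = d ** 2 / (d ** 2 + lam)          # eigenvalues of the hat matrix
    rss = np.sum(((1.0 - shrink) * Uty) ** 2) # residual sum of squares
    df = shrink.sum()                         # effective degrees of freedom
    return n * rss / (n - df) ** 2

# Hypothetical usage: choose the ridge parameter by GCV on random p > n data.
rng = np.random.default_rng(0)
n, p = 50, 500
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
d, Vt, Uty = ridge_factorize(X, y)
lam_best = min(np.logspace(-3, 3, 25), key=lambda lam: gcv(d, Uty, lam, n))
beta = ridge_coef(d, Vt, Uty, lam_best)

Because Uᵀy is precomputed at factorization time, each new value of the ridge parameter costs only O(n) for the GCV score and O(np) for the coefficient vector, in line with the per-lambda costs stated in the abstract.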
Similar resources
Reduced rank ridge regression and its kernel extensions
In multivariate linear regression, it is often assumed that the response matrix is intrinsically of lower rank. This could be because of the correlation structure among the predictor variables or because the coefficient matrix is of lower rank. To accommodate both, we propose a reduced rank ridge regression for multivariate linear regression. Specifically, we combine the ridge penalty with the reduced...
Topics on Reduced Rank Methods for Multivariate Regression
Topics in Reduced Rank Methods for Multivariate Regression by Ashin Mukherjee. Advisors: Professor Ji Zhu and Professor Naisyin Wang. Multivariate regression problems are a simple generalization of the univariate regression problem to the situation where we want to predict q (> 1) responses that depend on the same set of features or predictors. Problems of this type are encountered commonly in many...
Sharper Bounds for Regression and Low-Rank Approximation with Regularization
We study matrix sketching methods for regularized variants of linear regression, low rank approximation, and canonical correlation analysis. Our main focus is on sketching techniques which preserve the objective function value for regularized problems, which is an area that has remained largely unexplored. We study regularization both in a fairly broad setting, and in the specific context of th...
Sharper Bounds for Regularized Data Fitting
We study matrix sketching methods for regularized variants of linear regression, low rank approximation, and canonical correlation analysis. Our main focus is on sketching techniques which preserve the objective function value for regularized problems, which is an area that has remained largely unexplored. We study regularization both in a fairly broad setting, and in the specific context of th...
Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling
Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data regime, because deterministic algorithms provide interpretability to the practitioner by having no failure probability and always returning the same results. We pr...
Journal: Computational Statistics & Data Analysis
Volume: 50, Issue: -
Pages: -
Published: 2006